Comparison of Algorithmic and Human Assessments of Sentence Similarity

نویسندگان

  • John G. Mersch
  • R. Raymond Lang
چکیده

This paper describes a new method, based on information theory, for measuring sentence similarity. The method first computes the information content (IC) of dependency triples using corpus statistics generated by processing the Open American National Corpus (OANC) with the Stanford Parser. We define the similarity of two sentences as a function of (1) the similarity of their constituent dependency triples, and (2) the position of the triples in their respective dependency trees. We apply the algorithm to 15 pairs of sentences that were also given to human subjects to assign a similarity score. The humanand computer-generated scores are compared; the results are promising, but point to the need for further refinement.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Sentence generation for artificial brains: A glocal similarity-matching approach

A novel approach to sentence generation – SegSim, Sentence Generation by Similarity Matching – is outlined, and is argued to possess a number of desirable properties making it plausible as a model of sentence generation in the human brain, and useful as a guide for creating sentence generation components within artificial brains. The crux of the approach is to do as much as possible via similar...

متن کامل

مقایسه روش‌های مختلف یادگیری ماشین در خلاصه‌سازی استخراجی گفتار به گفتار فارسی بدون استفاده از رونوشت

In this paper, extractive speech summarization using different machine learning algorithms was investigated. The task of Speech summarization deals with extracting important and salient segments from speech in order to access, search, extract and browse speech files easier and in a less costly manner. In this paper, a new method for speech summarization without using automatic speech recognitio...

متن کامل

The Role of Algorithmic Applications in the Development of Architectural Forms (Case Study:Nine High-Rise Buildings)

The process of developing architectural forms has greatly been changed by advances in digital technology, especially in design tools and applications. In recent years, the advent of graphical scripting languages in the design process has profoundly affected 3D modeling. Scripting languages help develop algorithms and geometrical grammar of shapes based on their constituent parameters. This stud...

متن کامل

Parallel Genetic Algorithm Using Algorithmic Skeleton

Algorithmic skeleton has received attention as an efficient method of parallel programming in recent years. Using the method, the programmer can implement parallel programs easily. In this study, a set of efficient algorithmic skeletons is introduced for use in implementing parallel genetic algorithm (PGA).A performance modelis derived for each skeleton that makes the comparison of skeletons po...

متن کامل

A Sentence Semantic Similarity Calculating Method Based on Segmented Semantic Comparison

In order to calculate sentence semantic similarity more accurately, a sentence semantic similarity calculating method based on segmented semantic comparison was proposed. Sentences would be divided into the trunk and the other segments by some grammar rules, and each segment might be divided into several shorter segments. When calculating the sentence semantic similarity between two sentences, ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013